fix: mem0 grouped runs lose all groups after the first (telemetry qdrant lock) #27
Merged
…ant lock)

Root cause (from PR #25's error capture): mem0 with MEM0_TELEMETRY on (its default) opens a qdrant client at a FIXED path (~/.mem0/migrations_qdrant) inside every Memory(). qdrant local mode allows one client per path per process, and the grouped runner keeps the previous provider referenced for version_info, so the lock never releases: the first group succeeds, and every later group dies with 'Storage folder ... is already accessed'. Matrix v1 lost 24/25 LongMemEval and 30/30 ConvoMem mem0 groups this way; isolated repros passed because reassignment let the old client be garbage-collected.

- MEM0_TELEMETRY defaults to false before the deferred mem0 import (benchmark runs shouldn't emit telemetry anyway); the operator can still force-enable it.
- cleanup() explicitly closes the vector_store and _telemetry_vector_store clients and drops the Memory reference, releasing locks even while the instance stays referenced.

Verified under the exact failure conditions: three sequential grouped ingests with retained provider references, all OK.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>
Root cause (named by PR #25's error capture)
mem0 with `MEM0_TELEMETRY` on (the default) opens a qdrant client at a fixed path (`~/.mem0/migrations_qdrant`) inside every `Memory()`. qdrant local mode permits one client per path per process, and the grouped runner retains the previous provider (for `version_info`), so the lock never releases: the first group succeeds and every later group dies with `Storage folder ... is already accessed`. Matrix v1 lost 24/25 LongMemEval (LME) and 30/30 ConvoMem mem0 groups this way. Isolated repros passed because reassignment let the old client be garbage-collected, hence the ghost hunt.

Fix

- `MEM0_TELEMETRY=false` (via `setdefault`) is set before the deferred mem0 import. Benchmark runs shouldn't emit telemetry anyway, and operators can still force-enable it.
- `cleanup()` explicitly closes the `vector_store` / `_telemetry_vector_store` clients and drops the `Memory` reference, releasing the locks even while the provider instance stays referenced.

Verification
Reproduced the exact failure mode (sequential groups, retained references): fails on group 2 before, three groups clean after. 2 new unit tests; suite green, lint clean.
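
For readers who want to see the failure mode and both halves of the fix in isolation, here is a minimal sketch. It does not import mem0 or qdrant: `FakeLocalClient`, `FakeStore`, `FakeMemory`, `Provider`, `run_groups`, and the paths are all hypothetical stand-ins, and the one-client-per-path rule is simulated with an explicit set of locked paths rather than real storage locks. Only the `MEM0_TELEMETRY` variable name and the `vector_store` / `_telemetry_vector_store` attribute names come from the description above.

```python
import os


class FakeLocalClient:
    """Hypothetical local-mode client: one client per path per process."""

    _locked_paths: set = set()

    def __init__(self, path: str) -> None:
        if path in FakeLocalClient._locked_paths:
            raise RuntimeError(f"Storage folder {path} is already accessed")
        FakeLocalClient._locked_paths.add(path)
        self.path = path
        self.closed = False

    def close(self) -> None:
        # Idempotent: closing twice is harmless.
        if not self.closed:
            FakeLocalClient._locked_paths.discard(self.path)
            self.closed = True


class FakeStore:
    """Hypothetical store wrapper that owns a client."""

    def __init__(self, path: str) -> None:
        self.client = FakeLocalClient(path)


class FakeMemory:
    """Hypothetical memory object; the telemetry store uses a FIXED path."""

    def __init__(self, data_path: str, telemetry: bool) -> None:
        self.vector_store = FakeStore(data_path)
        self._telemetry_vector_store = (
            FakeStore("/fixed/telemetry/path") if telemetry else None
        )


class Provider:
    def __init__(self, group: str) -> None:
        # Default telemetry off without overriding an operator's explicit choice.
        os.environ.setdefault("MEM0_TELEMETRY", "false")
        telemetry = os.environ["MEM0_TELEMETRY"].lower() == "true"
        self.version_info = {"group": group}
        self.memory = FakeMemory(f"/data/{group}", telemetry)

    def cleanup(self) -> None:
        # Drop the memory reference and close both clients explicitly, so the
        # locks are released even if this provider object stays referenced.
        memory, self.memory = self.memory, None
        if memory is None:
            return
        for name in ("vector_store", "_telemetry_vector_store"):
            client = getattr(getattr(memory, name, None), "client", None)
            if client is not None:
                client.close()


def run_groups(groups, *, cleanup: bool) -> list:
    FakeLocalClient._locked_paths.clear()  # simulate a fresh process per run
    previous = None  # retained across groups, as the grouped runner does
    results = []
    for group in groups:
        try:
            provider = Provider(group)
        except RuntimeError:
            results.append("FAILED")
            continue
        results.append("OK")
        if cleanup:
            provider.cleanup()
        previous = provider
    return results


groups = ["g1", "g2", "g3"]

os.environ["MEM0_TELEMETRY"] = "true"  # operator force-enables telemetry
before = run_groups(groups, cleanup=False)
after = run_groups(groups, cleanup=True)

del os.environ["MEM0_TELEMETRY"]  # unset: setdefault turns telemetry off
default_off = run_groups(groups, cleanup=False)

print(before)       # ['OK', 'FAILED', 'FAILED']
print(after)        # ['OK', 'OK', 'OK']
print(default_off)  # ['OK', 'OK', 'OK']
```

In this sketch either change alone is enough to get three clean groups. Keeping both presumably covers the case where an operator force-enables telemetry, since `setdefault` leaves an explicit setting untouched and only `cleanup()` releases the fixed-path lock then.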
🤖 Generated with Claude Code